Overview

Dataset statistics

Number of variables: 20
Number of observations: 4888
Missing cells: 1012
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 763.9 KiB
Average record size in memory: 160.0 B
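
The derived figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over rows × columns and the average record size is total memory divided by the number of rows (variable names are illustrative, not taken from pandas-profiling):

```python
# Raw figures copied from the Dataset statistics block.
n_variables = 20
n_observations = 4888
missing_cells = 1012
total_memory_kib = 763.9

# Assumption: missing cells (%) = missing cells / (rows * columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the computed values round to the 1.0% and 160.0 B reported above.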

Variable types

CAT: 9
NUM: 8
BOOL: 3

Warnings

Designation is highly correlated with ProductPitched (High correlation)
ProductPitched is highly correlated with Designation (High correlation)
Age has 226 (4.6%) missing values (Missing)
DurationOfPitch has 251 (5.1%) missing values (Missing)
NumberOfTrips has 140 (2.9%) missing values (Missing)
NumberOfChildrenVisiting has 66 (1.4%) missing values (Missing)
MonthlyIncome has 233 (4.8%) missing values (Missing)
CustomerID has unique values (Unique)

Reproduction

Analysis started: 2022-09-24 17:26:26.441955
Analysis finished: 2022-09-24 17:26:34.969650
Duration: 8.53 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

CustomerID
Real number (ℝ≥0)

UNIQUE

Distinct: 4888
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202443.5
Minimum: 200000
Maximum: 204887
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 200000
5-th percentile: 200244.35
Q1: 201221.75
Median: 202443.5
Q3: 203665.25
95-th percentile: 204642.65
Maximum: 204887
Range: 4887
Interquartile range (IQR): 2443.5

Descriptive statistics

Standard deviation: 1411.188388
Coefficient of variation (CV): 0.006970776479
Kurtosis: -1.2
Mean: 202443.5
Median Absolute Deviation (MAD): 1222
Skewness: 0
Sum: 989543828
Variance: 1991452.667
Monotonicity: Strictly increasing
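
The CustomerID figures can be reproduced from first principles. The sketch below assumes the IDs are the consecutive integers 200000 to 204887, which is consistent with 4888 distinct values, a range of 4887 and strict monotonicity, but is not stated outright in the report. It also assumes the sample (n - 1) variance, which is the variant that matches the reported number.

```python
import statistics

# Assumption: IDs are the consecutive integers 200000..204887.
ids = list(range(200000, 204888))

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std = statistics.stdev(ids)          # sample standard deviation
cv = std / mean                      # coefficient of variation

# Monotonicity check: each value is larger than its predecessor.
strictly_increasing = all(a < b for a, b in zip(ids, ids[1:]))

print(total, mean)
print(round(variance, 3), round(std, 6))
print(strictly_increasing)
```

A coefficient of variation this small only reflects that the identifiers sit in a narrow band far from zero; it carries no analytical meaning for an ID column.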
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
200702 | 1 | < 0.1%
201479 | 1 | < 0.1%
203514 | 1 | < 0.1%
201467 | 1 | < 0.1%
203518 | 1 | < 0.1%
201471 | 1 | < 0.1%
203522 | 1 | < 0.1%
201475 | 1 | < 0.1%
203526 | 1 | < 0.1%
203530 | 1 | < 0.1%
Other values (4878) | 4878 | 99.8%

Minimum 5 values

Value | Count | Frequency (%)
200000 | 1 | < 0.1%
200001 | 1 | < 0.1%
200002 | 1 | < 0.1%
200003 | 1 | < 0.1%
200004 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
204887 | 1 | < 0.1%
204886 | 1 | < 0.1%
204885 | 1 | < 0.1%
204884 | 1 | < 0.1%
204883 | 1 | < 0.1%

ProdTaken
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
0 | 3968 | 81.2%
1 | 920 | 18.8%

Age
Real number (ℝ≥0)

MISSING

Distinct: 44
Distinct (%): 0.9%
Missing: 226
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.62226512
Minimum: 18
Maximum: 61
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 18
5-th percentile: 24
Q1: 31
Median: 36
Q3: 44
95-th percentile: 55
Maximum: 61
Range: 43
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.316387033
Coefficient of variation (CV): 0.2476296151
Kurtosis: -0.4513319674
Mean: 37.62226512
Median Absolute Deviation (MAD): 6
Skewness: 0.3829886837
Sum: 175395
Variance: 86.79506734
Monotonicity: Not monotonic
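
The Age figures are mutually consistent, which a few lines of arithmetic can confirm. The sketch assumes the mean and sum are taken over non-missing values only (pandas skips NaN by default); the variable names are illustrative.

```python
# Figures copied from the Age summary in this report.
n_observations = 4888
n_missing = 226
mean, std, total = 37.62226512, 9.316387033, 175395
q1, q3 = 31, 44
minimum, maximum = 18, 61

n_present = n_observations - n_missing      # values that entered the mean
missing_pct = 100 * n_missing / n_observations

iqr = q3 - q1                               # interquartile range
value_range = maximum - minimum
cv = std / mean                             # coefficient of variation
reconstructed_sum = mean * n_present        # should match the reported Sum

print(n_present, round(missing_pct, 1))
print(iqr, value_range, round(cv, 4))
print(round(reconstructed_sum))
```

The reconstructed sum agreeing with the reported Sum supports the assumption that missing ages were excluded rather than treated as zero.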
Histogram with fixed size bins (bins=44)

Common values

Value | Count | Frequency (%)
35 | 237 | 4.8%
36 | 231 | 4.7%
34 | 211 | 4.3%
31 | 203 | 4.2%
30 | 199 | 4.1%
32 | 197 | 4.0%
33 | 189 | 3.9%
37 | 185 | 3.8%
29 | 178 | 3.6%
38 | 176 | 3.6%
Other values (34) | 2656 | 54.3%
(Missing) | 226 | 4.6%

Minimum 5 values

Value | Count | Frequency (%)
18 | 14 | 0.3%
19 | 32 | 0.7%
20 | 38 | 0.8%
21 | 41 | 0.8%
22 | 46 | 0.9%

Maximum 5 values

Value | Count | Frequency (%)
61 | 9 | 0.2%
60 | 29 | 0.6%
59 | 44 | 0.9%
58 | 31 | 0.6%
57 | 29 | 0.6%

TypeofContact
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 25
Missing (%): 0.5%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Self Enquiry | 3444 | 70.5%
Company Invited | 1419 | 29.0%
(Missing) | 25 | 0.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 15
Median length: 12
Mean length: 12.82487725
Min length: 3

CityTier
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
1 | 3190 | 65.3%
3 | 1500 | 30.7%
2 | 198 | 4.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

DurationOfPitch
Real number (ℝ≥0)

MISSING

Distinct: 34
Distinct (%): 0.7%
Missing: 251
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.49083459
Minimum: 5
Maximum: 127
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 5
5-th percentile: 6
Q1: 9
Median: 13
Q3: 20
95-th percentile: 32
Maximum: 127
Range: 122
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.519642589
Coefficient of variation (CV): 0.5499795727
Kurtosis: 11.79749394
Mean: 15.49083459
Median Absolute Deviation (MAD): 5
Skewness: 1.752037049
Sum: 71831
Variance: 72.58430985
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=34)

Common values

Value | Count | Frequency (%)
9 | 483 | 9.9%
7 | 342 | 7.0%
8 | 333 | 6.8%
6 | 307 | 6.3%
16 | 274 | 5.6%
15 | 269 | 5.5%
14 | 253 | 5.2%
10 | 244 | 5.0%
13 | 223 | 4.6%
11 | 205 | 4.2%
Other values (24) | 1704 | 34.9%
(Missing) | 251 | 5.1%

Minimum 5 values

Value | Count | Frequency (%)
5 | 6 | 0.1%
6 | 307 | 6.3%
7 | 342 | 7.0%
8 | 333 | 6.8%
9 | 483 | 9.9%

Maximum 5 values

Value | Count | Frequency (%)
127 | 1 | < 0.1%
126 | 1 | < 0.1%
36 | 44 | 0.9%
35 | 66 | 1.4%
34 | 50 | 1.0%

Occupation
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Salaried | 2368 | 48.4%
Small Business | 2084 | 42.6%
Large Business | 434 | 8.9%
Free Lancer | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 14
Median length: 14
Mean length: 11.09206219
Min length: 8

Gender
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Male | 2916 | 59.7%
Female | 1817 | 37.2%
Fe Male | 155 | 3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 4
Mean length: 4.838584288
Min length: 4

NumberOfPersonVisiting
Real number (ℝ≥0)

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.90507365
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 3
Q3: 3
95-th percentile: 4
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.724890595
Coefficient of variation (CV): 0.2495257203
Kurtosis: -0.7774673393
Mean: 2.90507365
Median Absolute Deviation (MAD): 1
Skewness: 0.02981670374
Sum: 14200
Variance: 0.5254663748
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value | Count | Frequency (%)
3 | 2402 | 49.1%
2 | 1418 | 29.0%
4 | 1026 | 21.0%
1 | 39 | 0.8%
5 | 3 | 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 39 | 0.8%
2 | 1418 | 29.0%
3 | 2402 | 49.1%
4 | 1026 | 21.0%
5 | 3 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
5 | 3 | 0.1%
4 | 1026 | 21.0%
3 | 2402 | 49.1%
2 | 1418 | 29.0%
1 | 39 | 0.8%

NumberOfFollowups
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.1%
Missing: 45
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.708445179
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.002508686
Coefficient of variation (CV): 0.2703312677
Kurtosis: 0.6203311898
Mean: 3.708445179
Median Absolute Deviation (MAD): 1
Skewness: -0.3727193989
Sum: 17960
Variance: 1.005023666
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
4 | 2068 | 42.3%
3 | 1466 | 30.0%
5 | 768 | 15.7%
2 | 229 | 4.7%
1 | 176 | 3.6%
6 | 136 | 2.8%
(Missing) | 45 | 0.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 176 | 3.6%
2 | 229 | 4.7%
3 | 1466 | 30.0%
4 | 2068 | 42.3%
5 | 768 | 15.7%

Maximum 5 values

Value | Count | Frequency (%)
6 | 136 | 2.8%
5 | 768 | 15.7%
4 | 2068 | 42.3%
3 | 1466 | 30.0%
2 | 229 | 4.7%

ProductPitched
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Basic | 1842 | 37.7%
Deluxe | 1732 | 35.4%
Standard | 742 | 15.2%
Super Deluxe | 342 | 7.0%
King | 230 | 4.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 12
Median length: 6
Mean length: 6.252454992
Min length: 4

PreferredPropertyStar
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 26
Missing (%): 0.5%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
3 | 2993 | 61.2%
5 | 956 | 19.6%
4 | 913 | 18.7%
(Missing) | 26 | 0.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

MaritalStatus
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Married | 2340 | 47.9%
Divorced | 950 | 19.4%
Single | 916 | 18.7%
Unmarried | 682 | 14.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 9
Median length: 7
Mean length: 7.286006547
Min length: 6

NumberOfTrips
Real number (ℝ≥0)

MISSING

Distinct: 12
Distinct (%): 0.3%
Missing: 140
Missing (%): 2.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.23652064
Minimum: 1
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 7
Maximum: 22
Range: 21
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.84901931
Coefficient of variation (CV): 0.5712984761
Kurtosis: 6.0990233
Mean: 3.23652064
Median Absolute Deviation (MAD): 1
Skewness: 1.453883784
Sum: 15367
Variance: 3.418872408
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
2 | 1464 | 30.0%
3 | 1079 | 22.1%
1 | 620 | 12.7%
4 | 478 | 9.8%
5 | 458 | 9.4%
6 | 322 | 6.6%
7 | 218 | 4.5%
8 | 105 | 2.1%
21 | 1 | < 0.1%
19 | 1 | < 0.1%
Other values (2) | 2 | < 0.1%
(Missing) | 140 | 2.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 620 | 12.7%
2 | 1464 | 30.0%
3 | 1079 | 22.1%
4 | 478 | 9.8%
5 | 458 | 9.4%

Maximum 5 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
21 | 1 | < 0.1%
20 | 1 | < 0.1%
19 | 1 | < 0.1%
8 | 105 | 2.1%

Passport
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
0 | 3466 | 70.9%
1 | 1422 | 29.1%

PitchSatisfactionScore
Real number (ℝ≥0)

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.078150573
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.365791728
Coefficient of variation (CV): 0.4437053014
Kurtosis: -1.102869771
Mean: 3.078150573
Median Absolute Deviation (MAD): 1
Skewness: -0.1277255598
Sum: 15046
Variance: 1.865387043
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Common values

Value | Count | Frequency (%)
3 | 1478 | 30.2%
5 | 970 | 19.8%
1 | 942 | 19.3%
4 | 912 | 18.7%
2 | 586 | 12.0%

Minimum 5 values

Value | Count | Frequency (%)
1 | 942 | 19.3%
2 | 586 | 12.0%
3 | 1478 | 30.2%
4 | 912 | 18.7%
5 | 970 | 19.8%

Maximum 5 values

Value | Count | Frequency (%)
5 | 970 | 19.8%
4 | 912 | 18.7%
3 | 1478 | 30.2%
2 | 586 | 12.0%
1 | 942 | 19.3%

OwnCar
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
1 | 3032 | 62.0%
0 | 1856 | 38.0%

NumberOfChildrenVisiting
Categorical

MISSING

Distinct: 4
Distinct (%): 0.1%
Missing: 66
Missing (%): 1.4%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
1 | 2080 | 42.6%
2 | 1335 | 27.3%
0 | 1082 | 22.1%
3 | 325 | 6.6%
(Missing) | 66 | 1.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Designation
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Value | Count | Frequency (%)
Executive | 1842 | 37.7%
Manager | 1732 | 35.4%
Senior Manager | 742 | 15.2%
AVP | 342 | 7.0%
VP | 230 | 4.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 14
Median length: 9
Mean length: 8.301145663
Min length: 2

MonthlyIncome
Real number (ℝ≥0)

MISSING

Distinct: 2475
Distinct (%): 53.2%
Missing: 233
Missing (%): 4.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 23619.85349
Minimum: 1000
Maximum: 98678
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.2 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 17295.1
Q1: 20346
Median: 22347
Q3: 25571
95-th percentile: 34723.9
Maximum: 98678
Range: 97678
Interquartile range (IQR): 5225

Descriptive statistics

Standard deviation: 5380.698361
Coefficient of variation (CV): 0.2278040532
Kurtosis: 14.8440669
Mean: 23619.85349
Median Absolute Deviation (MAD): 2603
Skewness: 1.949159832
Sum: 109950418
Variance: 28951914.85
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20855 | 7 | 0.1%
17342 | 7 | 0.1%
21288 | 7 | 0.1%
21020 | 7 | 0.1%
25482 | 6 | 0.1%
24950 | 6 | 0.1%
25025 | 6 | 0.1%
22130 | 6 | 0.1%
21419 | 6 | 0.1%
21237 | 6 | 0.1%
Other values (2465) | 4591 | 93.9%
(Missing) | 233 | 4.8%

Minimum 5 values

Value | Count | Frequency (%)
1000 | 1 | < 0.1%
4678 | 1 | < 0.1%
16009 | 2 | < 0.1%
16051 | 2 | < 0.1%
16052 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
98678 | 1 | < 0.1%
95000 | 1 | < 0.1%
38677 | 2 | < 0.1%
38651 | 2 | < 0.1%
38621 | 2 | < 0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for an exactly linear relationship the steepness of the line (its angle to the x-axis) does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
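
The calculation described above can be sketched in plain Python. This is an illustrative implementation on made-up data, not the code pandas-profiling runs; the function name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Made-up data: exact linear relationships with different slopes and offsets.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
gentle = pearson_r(x, [2 * v + 1 for v in x])      # shallow positive slope
steep = pearson_r(x, [100 * v - 7 for v in x])     # steep positive slope
negative = pearson_r(x, [-0.5 * v for v in x])     # negative slope

print(gentle, steep, negative)
```

Both positive-slope cases give r close to +1 regardless of steepness, and the negative slope gives r close to -1, illustrating the invariance to location and scale.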

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
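
As a rough illustration of that definition, the sketch below ranks each variable (tied values share the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names are hypothetical and the data are made up.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: a monotonic but nonlinear (cubic) relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this cubic example ρ is 1 up to floating-point error, while Pearson's r stays below 1, which is the difference the paragraph above describes.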

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
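
A direct, quadratic-time sketch of that pair-counting definition follows. It implements the plain form described here (often called tau-a); statistical libraries commonly use a tie-adjusted variant, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0 means a tie; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
print(tau, tau_reversed)
```

With five concordant pairs and one discordant pair out of six, τ is 4/6; a fully reversed ordering gives -1.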

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
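
A minimal sketch of the bias-corrected estimator, written from the published formula rather than from pandas-profiling's source. The second contingency table is hypothetical: it assumes each ProductPitched level maps to exactly one Designation level, which the identical marginal counts in this report suggest but do not prove.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V (Bergsma, 2013) for a table given as a list of rows."""
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    n = sum(row_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    # Bias correction of phi-squared and of the table dimensions.
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Made-up independent table: rows are proportional to each other.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])

# Hypothetical one-to-one mapping using the marginal counts from this report.
counts = [1842, 1732, 742, 342, 230]
diagonal = [[c if i == j else 0 for j in range(5)] for i, c in enumerate(counts)]
v_perfect = cramers_v_corrected(diagonal)

print(v_independent, v_perfect)
```

Under the one-to-one assumption the coefficient comes out at 1 up to floating-point error, which would be consistent with the high-correlation warning between Designation and ProductPitched.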

Missing values


Sample

First rows

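Index | CustomerID | ProdTaken | Age | TypeofContact | CityTier | DurationOfPitch | Occupation | Gender | NumberOfPersonVisiting | NumberOfFollowups | ProductPitched | PreferredPropertyStar | MaritalStatus | NumberOfTrips | Passport | PitchSatisfactionScore | OwnCar | NumberOfChildrenVisiting | Designation | MonthlyIncome
0 | 200000 | 1 | 41.0 | Self Enquiry | 3 | 6.0 | Salaried | Female | 3 | 3.0 | Deluxe | 3.0 | Single | 1.0 | 1 | 2 | 1 | 0.0 | Manager | 20993.0
1 | 200001 | 0 | 49.0 | Company Invited | 1 | 14.0 | Salaried | Male | 3 | 4.0 | Deluxe | 4.0 | Divorced | 2.0 | 0 | 3 | 1 | 2.0 | Manager | 20130.0
2 | 200002 | 1 | 37.0 | Self Enquiry | 1 | 8.0 | Free Lancer | Male | 3 | 4.0 | Basic | 3.0 | Single | 7.0 | 1 | 3 | 0 | 0.0 | Executive | 17090.0
3 | 200003 | 0 | 33.0 | Company Invited | 1 | 9.0 | Salaried | Female | 2 | 3.0 | Basic | 3.0 | Divorced | 2.0 | 1 | 5 | 1 | 1.0 | Executive | 17909.0
4 | 200004 | 0 | NaN | Self Enquiry | 1 | 8.0 | Small Business | Male | 2 | 3.0 | Basic | 4.0 | Divorced | 1.0 | 0 | 5 | 1 | 0.0 | Executive | 18468.0
5 | 200005 | 0 | 32.0 | Company Invited | 1 | 8.0 | Salaried | Male | 3 | 3.0 | Basic | 3.0 | Single | 1.0 | 0 | 5 | 1 | 1.0 | Executive | 18068.0
6 | 200006 | 0 | 59.0 | Self Enquiry | 1 | 9.0 | Small Business | Female | 2 | 2.0 | Basic | 5.0 | Divorced | 5.0 | 1 | 2 | 1 | 1.0 | Executive | 17670.0
7 | 200007 | 0 | 30.0 | Self Enquiry | 1 | 30.0 | Salaried | Male | 3 | 3.0 | Basic | 3.0 | Married | 2.0 | 0 | 2 | 0 | 1.0 | Executive | 17693.0
8 | 200008 | 0 | 38.0 | Company Invited | 1 | 29.0 | Salaried | Male | 2 | 4.0 | Standard | 3.0 | Unmarried | 1.0 | 0 | 3 | 0 | 0.0 | Senior Manager | 24526.0
9 | 200009 | 0 | 36.0 | Self Enquiry | 1 | 33.0 | Small Business | Male | 3 | 3.0 | Deluxe | 3.0 | Divorced | 7.0 | 0 | 3 | 1 | 0.0 | Manager | 20237.0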

Last rows

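Index | CustomerID | ProdTaken | Age | TypeofContact | CityTier | DurationOfPitch | Occupation | Gender | NumberOfPersonVisiting | NumberOfFollowups | ProductPitched | PreferredPropertyStar | MaritalStatus | NumberOfTrips | Passport | PitchSatisfactionScore | OwnCar | NumberOfChildrenVisiting | Designation | MonthlyIncome
4878 | 204878 | 1 | 35.0 | Self Enquiry | 1 | 17.0 | Small Business | Male | 3 | 4.0 | Deluxe | 5.0 | Unmarried | 3.0 | 0 | 4 | 0 | 1.0 | Manager | 24803.0
4879 | 204879 | 1 | 26.0 | Self Enquiry | 2 | 27.0 | Small Business | Female | 4 | 4.0 | Basic | 4.0 | Married | 2.0 | 1 | 3 | 0 | 2.0 | Executive | 22347.0
4880 | 204880 | 1 | 59.0 | Self Enquiry | 1 | 28.0 | Small Business | Female | 4 | 4.0 | Deluxe | 4.0 | Married | 6.0 | 0 | 3 | 1 | 2.0 | Manager | 28686.0
4881 | 204881 | 1 | 41.0 | Self Enquiry | 2 | 25.0 | Salaried | Male | 3 | 2.0 | Basic | 5.0 | Married | 2.0 | 0 | 1 | 1 | 2.0 | Executive | 21065.0
4882 | 204882 | 1 | 37.0 | Self Enquiry | 2 | 20.0 | Salaried | Male | 3 | 5.0 | Basic | 5.0 | Married | 6.0 | 1 | 5 | 1 | 2.0 | Executive | 23317.0
4883 | 204883 | 1 | 49.0 | Self Enquiry | 3 | 9.0 | Small Business | Male | 3 | 5.0 | Deluxe | 4.0 | Unmarried | 2.0 | 1 | 1 | 1 | 1.0 | Manager | 26576.0
4884 | 204884 | 1 | 28.0 | Company Invited | 1 | 31.0 | Salaried | Male | 4 | 5.0 | Basic | 3.0 | Single | 3.0 | 1 | 3 | 1 | 2.0 | Executive | 21212.0
4885 | 204885 | 1 | 52.0 | Self Enquiry | 3 | 17.0 | Salaried | Female | 4 | 4.0 | Standard | 4.0 | Married | 7.0 | 0 | 1 | 1 | 3.0 | Senior Manager | 31820.0
4886 | 204886 | 1 | 19.0 | Self Enquiry | 3 | 16.0 | Small Business | Male | 3 | 4.0 | Basic | 3.0 | Single | 3.0 | 0 | 5 | 0 | 2.0 | Executive | 20289.0
4887 | 204887 | 1 | 36.0 | Self Enquiry | 1 | 14.0 | Salaried | Male | 4 | 4.0 | Basic | 4.0 | Unmarried | 3.0 | 1 | 3 | 1 | 2.0 | Executive | 24041.0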